Towards Extracting Domain Knowledge from C Code

Authors

  • Mohammed Tarabain
  • Gunter Saake
Abstract

“Writing code is not the problem, understanding the code is the problem.” This saying [7] summarizes how important domain knowledge is in software development and maintenance. Gathering this knowledge is an expensive and demanding process: it requires an investment of time, money, and resources, because the knowledge is scattered over various locations within the source code. In this work, we propose a method to recover domain knowledge from C code using identifiers. To this end, we extract identifiers from source code, use them to generate domain concepts, and investigate their interrelations. We describe four use cases in which the domain concepts and their interrelations are typically used. To evaluate the performance of our approach, we conduct two experiments. Both experiments show promising results, in that our approach misses only a few relevant concepts and rarely generates irrelevant concepts. Currently, our approach is not fully automated, because the user has to go through a short list that contains both domain concepts and general concepts and manually remove the general ones.
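To make the idea of identifier-based extraction concrete, here is a minimal Python sketch. It is not the authors' method, which the abstract does not detail. It assumes a simple regular-expression tokenizer, a hand-written C keyword list, and splitting on underscores and camelCase boundaries; the function names and the sample snippet are hypothetical. Preprocessor directives and macros are not treated specially.

```python
import re
from collections import Counter

# Assumption: C keywords are filtered out because they carry no domain meaning.
C_KEYWORDS = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile", "while",
}

# Comments and literals are removed so their words are not counted as identifiers.
_STRIP_RE = re.compile(
    r"""//[^\n]*            # line comment
      | /\*.*?\*/           # block comment
      | "(?:\\.|[^"\\])*"   # string literal
      | '(?:\\.|[^'\\])*'   # character literal
    """,
    re.DOTALL | re.VERBOSE,
)
_IDENT_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")
_WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def extract_identifiers(c_source):
    """Return all non-keyword identifiers found in a C source string."""
    stripped = _STRIP_RE.sub(" ", c_source)
    return [tok for tok in _IDENT_RE.findall(stripped) if tok not in C_KEYWORDS]


def split_identifier(identifier):
    """Split snake_case and camelCase identifiers into lowercase terms."""
    words = []
    for part in identifier.split("_"):
        words.extend(_WORD_RE.findall(part))
    return [word.lower() for word in words]


def term_frequencies(c_source):
    """Count how often each term occurs across all identifiers."""
    counts = Counter()
    for identifier in extract_identifiers(c_source):
        counts.update(split_identifier(identifier))
    return counts


# Hypothetical C snippet used only for illustration.
SAMPLE = """
/* update balance */
int deposit(struct account *acct, int depositAmount) {
    acct->account_balance += depositAmount;
    return acct->account_balance;
}
"""

if __name__ == "__main__":
    for term, count in term_frequencies(SAMPLE).most_common():
        print(term, count)
```

Frequent terms such as `account` and `balance` would be candidate domain concepts, while general terms would still need the manual filtering step the abstract mentions.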

Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Intentional meaning of programs

Software engineering is a quest for appropriate modeling and abstraction. Writing programs that simulate parts of the real world requires programmers to fill the conceptual gap between the domain knowledge and computer languages. As a consequence of the conceptual distance between the business domain and the general purpose programming languages, clearly identifiable concepts at the domain leve...

Presenting a model for extracting information from text documents, based on text mining in the e-learning domain

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...

A Domain Independent Approach for Extracting Terms from Research Papers

We study the problem of extracting terms from research papers, which is an important step towards building knowledge graphs in research domain. Existing terminology extraction approaches are mostly domain dependent. They use domain specific linguistic rules, supervised machine learning techniques or a combination of the two to extract the terms. Using domain knowledge requires much human effort...

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Publication date: 2014